Abstract

Objectives: To show why and how the Hamilton Rating Scale for Depression became the 'Gold 
Standard' for assessing therapies from the mid-1960s and how it was used to frame depression as a
short-term and curable illness rather than a chronic one. 

Methods: My approach is that of the social construction of knowledge, identifying the interests, 
institutional contexts and practices that produce knowledge claims and then mapping the social 
processes of their circulation, validation and acceptance. 

Results: The circulation and validation of the Hamilton Rating Scale for Depression were relatively slow
and it became a 'Gold Standard' 'from below', from an emerging consensus amongst psychiatrists
undertaking clinical trials for depression, which from the 1960s were principally with
psychopharmaceuticals for short-term illness. The Hamilton Rating Scale for Depression, drug trials
and the construction of depression as non-chronic were mutually constituted.

Discussion: The Hamilton Rating Scale for Depression framed depression and its sufferers in new
ways, leading psychiatrists to understand illness as a treatable episode, rather than a life course
condition. As such, the scale served the interests of psychiatrists and
psychiatry in its new era of drug therapy outside the mental hospital. However, the Hamilton Rating
Scale for Depression was a strange kind of 'standard', being quite non-standard in the widely
varying ways it was used and the meanings given to its findings.
Introduction

There has been much discussion in recent
years about whether depression is a chronic
illness, as against the modern view that it is
typically time-limited. 1 Gask dated the
growing dominance of this view to the 
1980s and 'the launch and promotion of a 
new group of antidepressants, the selective 
serotonin reuptake inhibitors (SSRIs)'. 2 The 
traditional view of depressive illness, from 
melancholia in the nineteenth century to 
Kraepelin's characterisation of manic 
depression that dominated twentieth-cen- 
tury psychiatry, was that the illness was 
recurrent, chronic or both. Sufferers could 
spend years in mental hospitals, where, from 
the 1950s, they might receive regular electro- 
convulsive therapy (ECT). The changes in 
the last quarter of the twentieth century are 
well known and recognised as revolutionary 
at all levels: definitions of 'depression' and 
the impact of DSM-III; the treatment of 
choice shifting from ECT to drugs; the 
closure of long-stay hospitals and the devel- 
opment of community care where sufferers 
from depression are mainly treated by gen- 
eral practitioners. The impact of these 
changes on medical views of depression 
was evident in an Editorial in Psychological 
Medicine in 2012, which had to remind 
readers of new evidence that amongst 
patients diagnosed with depression, only 
half had a single episode and half had a 
recurrent and chronic life-long illness. 3 The 
authors argued that more effort should now 
be given to identifying recurrence, with a 
view to altering 'the trajectory of depression 
that is so chronic, severe and disabling' for 
'the betterment of so very many'. 



My principal research question is when and
how did the view that depression was typically
time-limited and non-chronic originate?
Was it in the 1980s and early 1990s with the
arrival of SSRIs? These drugs were
undoubtedly important, but so too were
the changes in service provision and a host
of other patient, professional and other
factors. In this article, I investigate the
longer-term origins in the ways that depression
was framed by psychiatrists through the
impact of the Hamilton Rating Scale for
Depression (HRSD), which from the 1970s
became, and to a large extent remains,
the dominant tool in assessing the severity
of depression. A key feature of HRSD
was that it was used to measure the outcome
of treatment, especially drugs, and was
applied as a 'before and after' schema,
leading to the view that depression was
an event, thereby downplaying seriality. My
argument also offers a case study of the
impact of standard scales in medicine, and
the interaction of drug standards and
standard drugs.

Methods

My methods are those of the social
construction of knowledge, explaining how
ways of knowing and practising are formulated
in specific social contexts, then circulated
and validated in contingent settings
by a variety of actors. Constructivist historical
methods were applied to articles
and books that discussed the application
of HRSD to various patient groups in
hospital and community settings from the
1960s to the late 1970s. Sources were
identified from the standard online
databases, PubMed (keyword) and Science
Direct (full text), and quantitative indicators
were derived from Web of Science.
Detailed qualitative analysis of selected articles
was also made, using close reading to
identify the assumptions and modes of analysis
of the authors.



Results 

The 21-item HRSD for assessing the severity
of depression was developed by the English
psychiatrist Max Hamilton and presented to
the psychiatric community in 1960 in the
then somewhat obscure Journal of
Neurology, Neurosurgery and Psychiatry. 4
Interviewed in 1982, Hamilton observed
that, after completing a number of clinical
trials on new drugs,

I was also interviewing people about my 
depression scale and trying to see if I could 
get some work going on depression. I went 
around with my scale and it created a 
tremendous wave of apathy. They all 
thought I was a bit mad. Eventually I got 
it published in the Journal of Neurology, 
Neurosurgery and Psychiatry. It was the 
only one that would take it. 

He took some pleasure in adding that, 
'And now everyone tells me the scale is 
wonderful, I always remember when it had a 
different reception. This makes sure I don't 
get a swollen head.' Whether the last point 
was accurate is open to debate, as Hamilton 
was quite a domineering figure, but there is 
no doubt that his rating scale was, and still 
is, widely used. It has earned the title of the 
'Gold Standard' for the assessment of 
depression, though its reign may now be 
limited. 6 Given its status and influence, it is
surprising that it has not been the subject of
historical enquiry, and even authors who are
critical of modern psychiatry and its
'manufacturing of depression' have not subjected
it to scrutiny. 7

There are two explanations of its domin- 
ance, both of which have some merit but are 
not the whole story. The first, which is 
common amongst psychiatrists, is that 
HRSD became the 'Gold Standard' simply 
by being the earliest scale to enjoy wide- 
spread use. However, it was born into a 
world of already competing scales, so the 
key question to answer is, why and how did 
it see off its rivals? Interestingly, Hamilton's 
Anxiety Scale, which was actually published 
before HRSD and hence was more of a 
'first,' did not endure. The second explan- 
ation is that HRSD was ideally suited to 
measure the effects of drug treatments, 
especially tricyclics such as imipramine,
which were 'somewhat anxiolytic and somewhat
sedative in effect.' 8 HRSD scored for
sleep and for weight gain, which were
known to be affected by tricyclics. In other 
words and to quote one reviewer of The 
Antidepressant Era, 'The early drugs defined 
the very scale that was used to measure their 
performance.' 9 One recent critic of the scale 
wrote that Hamilton 'fashioned his test to 
meet the needs of his drug company 
patrons.' 7 Healy says that there is no evi- 
dence that Hamilton used his own scale in 
clinical practice, but then it was a research 
rather than clinical tool, designed to quan- 
tify changes in a patient's condition over 
time. 10 It is unclear whether Hamilton had 
direct 'drug company patrons,' though he 
was the founding President of the British
Association for Psychopharmacology and an
early member of the International College of
Neuro-Psychopharmacology (CINP), which
since his death in 1988 has awarded an
annual prize in his name. On the other hand,
Hamilton is widely described as an iconoclast
and seems to have been a socialist; he
was certainly a strong defender of the
National Health Service in the 1980s when
it was under threat from Thatcher-era cuts in
public spending. What is clear is that in the
late 1950s and early 1960s Hamilton had 
many motives and that his abrasive charac- 
ter meant that pleasing anyone was not high 
on the list. 

In this article, I argue that the dominance
of HRSD was only slowly achieved, that
in its first two decades it had many rivals, and
that no one was more surprised than
Hamilton himself that it proved to be so
successful. Also, its dominance was largely
in clinical research, where trial findings were
quite often translated into simple before-and-after
scores. There was an inherent bias to consider
depression as time-limited, all the
more so as a result of drug treatment.
Hamilton created the scale to enable 
psychiatrists to chart changes in already 
diagnosed patients through particular 
treatment regimes, converting qualitative 
judgments into quantitative data on a 
fine-grained 100-point scale. The scale also
allowed psychiatrists to determine what the 
most significant changes were in an array of 
symptoms; though as I will show, most early 
studies used the aggregated scores rather 
than disaggregated data. Indeed, studies in 
the 1980s demonstrated that the schema was 
modified promiscuously, with psychiatrists 
adding and subtracting items to assess. 11 In 
1990, Zitman et al. surveyed five major 
journals over a year for research papers 
using the HAM-D and asked authors for
a copy of the scale they used. Fewer than
half the investigators referenced the correct 
version of the HAM-D, and only 4 out of 51 
responders used versions that were the same 
as a published version. 

HRSD was not designed as a diagnostic
schema, though many used it as such, and
one reason for its success was that its
approach anticipated the emphasis on symptoms
and disease entities enshrined in DSM-III
in 1980. 12 Although invented well before
even DSM-II (1968), Hamilton's scale was 
for a specific condition and proposed stand- 
ardisation around overt symptoms, the fea- 
tures that distinguished the third from the 
second version of the DSM. Shaped by the 
assumptions of dominant psychodynamic 
approaches, DSM-I and -II had 'conceived 
of symptoms as reflections of broad underlying
dynamic conditions that only
became meaningful through exploring the
personal history of each individual'. 12
Influenced strongly by Karl Menninger's 
assumption that all mental disorders were 
reducible to 'the failure of the suffering 
individual to adapt to his or her environ- 
ment', psychiatrists tended to focus on 
finding underlying mental causes and to 
interpret these as constitutional and likely 
to be chronic. 13 DSM-III's move towards 
specific diseases and to focus on symptoms 
rather than underlying causes weakened 
these imperatives. 

Max Hamilton was born in Offenbach, 
near Frankfurt, in 1912, and his parents
moved to London in 1915. 14 He qualified
in medicine at University College Hospital 
London in 1934 and worked in a number 
of posts before settling upon psychiatry in 
1946, when he joined the Maudsley 
Hospital in London. He worked at various 
London hospitals and began an association 
with Cyril Burt that led him to develop 
expertise in, and an almost missionary 
commitment to, psychometrics, which was 
fashionable in the psychological sciences in 
the 1950s. In 1953, he moved to the 
University of Leeds as lecturer in psychiatry.
He found little time for research and
in 1957 resigned to take up a temporary, 2-year
research position in the University.
This was funded by research grants from 
the Mental Health Research Trust and by 
a trial that his head of department, Ronald 
Hargreaves, was running on chlorpromaz- 
ine. In this work, Hamilton developed a 
number of scales, the first in 1957 in a 
study with Hargreaves on the value of 
Benactyzine in the treatment of anxiety, for 
which drugs and placebos were supplied by 
Glaxo. 15 The anxiety scale, later termed 
HAMA, anticipated many of the features 
of HRSD. 

We therefore classified all the symptoms 
likely to be found in our patients under the 
following headings: (1) anxious mood; (2) 
tension; (3) specific fears and phobias; (4) 
sleep disturbance; (5) intellectual disturb- 
ance; (6) depressive features; (7) somatic 
disturbances (muscular and sensory); (8) 
cardiovascular disturbance; (9) respiratory 
disturbances; (10) gastro-intestinal disturb- 
ances; (11) genitourinary disturbances; (12) 
autonomic disturbances and (13) manifest- 
ations of anxiety in the behaviour at the 
interview. A gloss was prepared listing the 
features to be taken into account in making 
an assessment under any of these headings. 
At the interview we rated each of these 
thirteen items on a five-point scale as 
follows: 0, none; 1, mild; 2, moderate; 3, 
severe; 4, grossly disabling. This rating 
scale yields a variety of different types of
information for each patient, including a 
"profile" of his symptomatology and, by 
summing the ratings for all headings, a 
gross symptom score. 15 

One conclusion of this study was that 
'impressionistic global judgments of a 
patient's condition alone are of little value 
in assessing the effect upon him of a par- 
ticular regime'. Hamilton had previously 
spoken on the use of scales in this work on 
anxiety at the British Psychological Society 
in 1956. 16 In what became a feature of his 
publications of scales, he devoted much of 
the paper to sophisticated statistical testing 
of reliability and reproducibility. As noted
already, HAMA was further elaborated in
the 1960s but did not have the success of
HRSD; that is a topic for another paper.

The first iteration of the HRSD scale was 
actually published in 1959, in an article co- 
authored with Jack White, a consultant 
psychiatrist at the Stanley Royd Hospital, 
Wakefield. 17 The famous 1960 paper was 
already in press and mentioned, though 
without a citation. The scale in the 1959 
paper offered a different and more finely 
grained classification of patient symptoms, 
moving away from the three accepted 
dichotomies: Reactive - Endogenous; 
Agitated - Retarded; Neurotic - Psychotic. 
Hamilton and White subjected patients'
scores on their schema to factor analysis 
and identified four groups of patients and 
types of depression: Endogenous, Doubtful 
Endogenous, Doubtful Reactive and 
Reactive. In other words, they were using 
the scores for the classification of different 
types of depression. In conclusion, they 
argued that, with the range of therapeutic 
options increasing as new drugs were added 
to ECT and psychotherapy, it was import- 
ant for psychiatrists to be better able to 
differentiate forms of depression and their 
response to treatments. The study was of 64 
male patients at Stanley Royd and included 
an Appendix of case histories of 20 patients,
which showed that they had received a
variety of treatments. Of the 20, 16 had
received ECT, so the origins of HRSD lie in
charting the dominant therapeutic regimes
of the era; the scale was not developed only for
pharmaceutical treatment.

What became known as HRSD was 
proposed by Hamilton in his now famous 
and much cited 1960 paper. His stated aim
was to improve upon existing scales, which 
he criticised for being inappropriate, unreli- 
able or using ill-defined symptoms. 4 His new 
scale was to be used in interviews conducted 
by psychiatrists and was intended for 
patients already diagnosed with depression. 
It relied mostly on the observations of 
bodily (somatic) and behavioural features 
by psychiatrists, which were also weighted 
more heavily than the few symptoms that 
relied on patients' reports of their feelings
(Figure 1). 

The empirical basis of the paper was 
drawn from 49 of the 64 patients discussed 
in the 1959 paper. There were 17 variables in 
the new scale, each rated on either a four- or 
two-point range, which produced a potential 
maximum of 50 points for extremely severe 
illness. The recommendation was that two 
psychiatrists interview the patient separately 
and their scores be added together to give a 
rating out of 100 (Figure 2). The correlation 
between the scores of the two scorers (pre- 
sumably Hamilton and White) was found to 
be high and to improve with experience. 
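
The scoring arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the two-rater total as the text describes it (17 scored items, a maximum of 50 per rater, two raters summed to a rating out of 100). The function names and the patient ratings are hypothetical and are not taken from Hamilton's paper.

```python
N_ITEMS = 17          # scored variables in the 1960 scale, as stated in the text
MAX_PER_RATER = 50    # potential maximum for one rater, as stated in the text


def rater_total(ratings):
    """Sum one psychiatrist's item ratings after basic sanity checks."""
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item ratings, got {len(ratings)}")
    if any(r < 0 for r in ratings):
        raise ValueError("item ratings cannot be negative")
    total = sum(ratings)
    if total > MAX_PER_RATER:
        raise ValueError(f"total {total} exceeds the stated maximum of {MAX_PER_RATER}")
    return total


def combined_score(rater_a, rater_b):
    """Add the two raters' totals to give a rating out of 100."""
    return rater_total(rater_a) + rater_total(rater_b)


# Hypothetical ratings for one patient from two separate interviews.
rater_a = [2, 1, 0, 1, 1, 0, 3, 2, 1, 2, 1, 0, 1, 0, 1, 0, 0]
rater_b = [3, 1, 0, 1, 0, 0, 3, 2, 1, 2, 2, 0, 1, 0, 1, 1, 0]
print(combined_score(rater_a, rater_b))
```

The sketch checks only the overall total against the stated maximum; it does not encode which items were rated 0-4 and which 0-2.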

In discussing individual patients, 
Hamilton did not use their overall rating 
score; instead he gave their pattern of factor 
measures in terms of the four diagnostic 
groups identified in the 1959 paper with 
White: Factor 1: Endogenous, Factor 2: 
Doubtful Endogenous, Factor 3: Doubtful 
Reactive and Factor 4: Reactive. 17 Figure 3 
presents the description of one of the 
patients whose profile was predominantly 
Factor 1 and this ends with the classification 
of his illness as 'endogenous' and seemingly 
chronic and likely to relapse. 

A RATING SCALE FOR DEPRESSION

BY

MAX HAMILTON

From the Department of Psychiatry, University of Leeds

The appearance of yet another rating scale for
measuring symptoms of mental disorder may seem
unnecessary, since there are so many already in
existence and many of them have been extensively
used. Unfortunately, it cannot be said that perfection
has been achieved, and indeed, there is
considerable room for improvement.

[...] appear in different settings. Other symptoms are
difficult to define, except in terms of their settings,
e.g., mild agitation and derealization. A more
serious difficulty lies in the fallacy of naming. For
example, the term "delusions" covers schizophrenic,
depressive, hypochondriacal, and paranoid delusions.
They are all quite different and should be [...]

Figure 1. Hamilton's now famous paper on rating scales for depression was published in a little known
journal. 4



APPENDIX I
ASSESSMENT OF DEPRESSION

Symptom (each with a score range of 0-4 or 0-2):

Depressed mood
Guilt
Suicide
Insomnia, initial
Insomnia, middle
Insomnia, delayed
Work and interests
Retardation
Agitation
Anxiety, psychic
Anxiety, somatic
Somatic symptoms, gastrointestinal
Somatic symptoms, general
Genital symptoms
Hypochondriasis
Loss of insight
Loss of weight
Diurnal variation
Depersonalization, etc.
Paranoid symptoms
Obsessional symptoms

Grading for items scored 0-4: 0 Absent; 1 Mild or trivial; 2-3 Moderate; 4 Severe.
Grading for items scored 0-2: 0 Absent; 1 Slight or doubtful; 2 Clearly present.

Figure 2. The first published iteration of what became HAM-D or HRSD.



Hamilton made clear the importance of 
factor scores and their value over the clas- 
sical clinical categories. In summary, he 
wrote: 

A rating scale is described for use in assess- 
ing the symptoms of patients diagnosed as 
suffering from depressive states. The first 
four latent vectors of the intercorrelation 
matrix obtained from 49 male patients are 
of interest, as shown by (a) the factor 
saturations, (b) the case histories of patients
scoring highly in the factors and (c) the
correlation between factor scores and out- 
come after treatment. The general problem 
of the relationship between clinical syn- 
dromes and factors extracted from the 
intercorrelations of symptoms is discussed. 4 

There is no evidence in the paper that
'before and after' treatment scores were
taken; the only link to treatment seems
to be that the initial factor scores were
indicative of the outcome of (mostly ECT)
treatment. Hence, this first presentation of
HRSD can be read as offering a more refined
diagnosis or prognosis. In another paper
with Jack White, also published in 1960,
Hamilton assessed ratings as an indicator
of the outcome of depression treated
with ECT. 18

Factor 1. A man aged 39 years (Case 39) had factor
scores of F1 76, F2 37, F3 49, and F4 52.

This patient was admitted to hospital after two
attempts at suicide, first by electrocution, and, when this
failed, by an overdose of phenobarbitone. No psychological
precipitating factors were found. On admission
he was severely depressed and still actively suicidal. He
had strong feelings of guilt, and feared that he had
acquired venereal disease and was infecting others with
it. He was markedly retarded and showed loss of
insight. His sleep was disturbed in all three phases, he
had no interest in anything and had complete loss of
libido since the onset of his illness four months previously.
His symptoms cleared with six courses of
electroshock treatment (E.C.T.). Two weeks later he
suddenly relapsed and attempted to cut his wrists with
a broken tumbler. He again recovered with a further
course of E.C.T. and has remained well ever since.

This case was one of classical endogenous depression.

Figure 3. An example of the case histories and commentaries included in Hamilton's 1960 paper. 4

The first published trial to use HRSD was 
a study of the use of the new drug amitrip- 
tyline by CG Burt and colleagues at the 
Royal Park Hospital in Melbourne, 
Australia. 19 For each patient an aggregate 
score out of 50 was first used to group 
patients; there was no factor analysis. 

After initial evaluation on Hamilton's 
(1960) scale for quantifying depressive 
illnesses, patients were allocated to one of 
four groups delineated on the basis of two 
leading prognostic criteria, age and sever- 
ity of illness. "Mild young" depressives 
were aged between 30 and 49 and, out of a 
possible maximum score of 100, had total 
scale scores below 40; "young severe" 
depressives were between 30 and 49 and 
had total scale scores above 40. Similar 
criteria of severity were used in the "old
mild" and "old severe", who were aged
between 50 and 70.

The same overall rating score was used to
assess the outcome after one and then four
weeks' treatment with amitriptyline compared
to imipramine, the latter being the
market leader for severe depression. The
Table and Chart below show the range in
individual rating scores and aggregates for
the 'old severe' group. In fact, this was one
of the few studies in the period that presented
the symptom scores separately; typically
the single aggregate score out of 50 or
100 was used (Figure 4).

Table II. Mean Improvement Scores on Physicians' Scale Ratings of Total Group

Columns: Specific Depressive Symptom; Possible Range of Scores; After 1 Week (Amitrip., Imip.); After 4 Weeks (Amitrip., n=37; Imip., n=36).
Rows: Depressed Mood; Retardation; Work & Interests; Guilt; Suicide; Psychic Anxiety; Somatic Anxiety; Hypochondriasis; Agitation; Initial Insomnia; Middle Insomnia; Delayed Insomnia; Somatic Symptoms (Gastro-Int.); Somatic Symptoms (General); Genital Symptoms; Loss of Weight; Loss of Insight.
Total Improvement Score (possible range 0-100): 17.9 (Amitrip.) and 14.7 (Imip.) after 1 week; 35.2* (Amitrip.) and 26.7 (Imip.) after 4 weeks.
* Significant at 5 per cent level.
† Significant at 1 per cent level.

Chart (Fig. 2 in the original): Mean one-week improvement, "old severe" group.

Figure 4. Burt CG, et al. Amitriptyline in depressive states: a controlled trial. Br J Psychiatry 1962; 108:
711-730.

In their discussion, Burt et al. made two
key points about the HRSD that were, and
are still, widely stated to account for its
widespread use: (1) it was 'simple to use and
rapidly completed' and (2) it could map
changes that drugs brought in specific symptoms.
Burt and his colleagues wrote of
'target' symptoms, which was perhaps an
implicit comparison to the blunderbuss of
ECT and its impact on the whole psyche.
HRSD could certainly also map the
temporal and experiential dimensions of
treatments that were difficult to collect
from patients after ECT. Fritz Freyhan,
Clinical Director, Director of Research,
Delaware State Hospital, Farnhurst,
Delaware, explained this point in 1960,
showing how drug treatments could be
combined with psychotherapy.

The pharmacological treatment of depres- 
sions offers this immense psychological 
advantage: the patient maintains his 
experiential continuity. The amnestic syn- 
drome associated with ECT, to which 
many attributed therapeutic significance, 
proves to be quite superfluous as is seen in 
successful pharmacotherapy. The preser- 
vation of experiential continuity has vast 
implications for psychotherapy. Until now, 
psychotherapy either followed ECT or had 
to be limited to patients who seemed 
capable of affective contact and of self- 
control over suicidal impulses. With ECT, 
the patient remains physically and emo- 
tionally passive. His recovery comes, as it 
were, from without. Pharmacotherapy 
makes him a participating partner. This 
offers psychotherapy entirely new oppor- 
tunities to involve the patient in the thera- 
peutic process until recovery is seen as 
coming from within. 20 

The second study to use the scale, albeit 
casually and with crude aggregate scores, 
was by AA Robin and J Harris at Runwell 
Hospital, Essex, in a comparison of imipra- 
mine and ECT. 21 In this study, as in many 
others at this time, ECT was found to give 
better outcomes. 

In 1963, JT Rose published a study of 
patient responses to ECT using HRSD. 22 In 
measuring the impact of therapy, he vali- 
dated HRSD by the fact 'that a drop in the 
score corresponded in the great majority of 
cases with improvement as recorded by 
overall clinical assessments and with falling 
scores in the occupational therapy ratings.' 
This is interesting as Hamilton developed his
scale because of his dissatisfaction with
overall clinical assessments and other
scales. Cross reference to, and validation
against, overall clinical assessment was 
common in discussions of HRSD through- 
out the 1960s and 1970s, not least because 
the scale was about changing qualitative 
judgments of clinical outcomes into quanti- 
tative values, either in a single score or a 
matrix of scores. 

Interestingly, HRSD was not used in 
1964-1965 in a major clinical trial on treat- 
ments for depressive illness organised by the 
Clinical Psychiatry Committee of the 
Medical Research Council (MRC), even 
though Hamilton played a leading role in 
the scheme. 22 The trial used both an overall 
clinical rating of severity and its own scale of 
15 symptoms: depressed mood, psychomotor
retardation, suicidal ideas, ideas of
bodily change, ideas of reference, self-reproach,
anxiety, insomnia (early, middle,
late), anorexia and fatigue. This scale bore a
close relation to HRSD in both the symp- 
toms monitored and the range of scoring, 
giving tacit endorsement to Hamilton's 
approach if not his particular scale. In fact, 
the Committee invented its own so-called 
'MRC Scale', which was used quite widely 
for a number of years, but fell away as 
HRSD took centre stage. 

Figure 5. Number of articles each year citing Hamilton M, A rating scale for depression, J Neurol Neurosurg
Psychiatry 1960; 23: 56-62; and number of articles cited with "depression" in the title. Source: Web of Science.

That the uptake of HRSD was relatively
slow is borne out by the number of publications
in which it was cited in its first 20 years
(see Figure 5, which is presented with all the
usual caveats about citations and what they
mean). Two sets of data are given: the
number of articles each year citing
Hamilton's 1960 paper and the number of
papers cited with 'depression' in the title.
There is steady growth in the number of
papers citing HRSD, but this is slower than
the overall growth of citations on depression,
bearing in mind that both were
influenced by the increase in the number of
medical journals and the drive to publish
more and often. Also, there were many
publications, particularly at the end of the
1970s, in which HRSD was used without
citing the 1960 paper. Perhaps it was too
well known to need citing? Perhaps the
absence of citation indicated that it was
being used only casually? And, of course,
citation did not mean that authors followed
Hamilton's protocols; in fact, psychiatrists
used HRSD selectively and flexibly. Writing
in 2001, Jane Williams observed that
over time,

Several versions of the scale had come into 
use, with differences in their total number 
of items, their anchor descriptions, their 
item interpretations and their scoring con- 
ventions .... By 1990 there were so many 
versions of the HAM-D that researchers 
and clinicians had lost track of what was 
available, and what were the characteristics 
of each one. No single version of the 
HAM-D or single set of conventions has 
been universally accepted. 11

Williams noted that by this time, in 
different publications the number of items 
scored as HRSD had risen from 17 to 59. 11 



For much of the 1960s, HRSD was 
discussed as just another rating scale. For 
example in 1965, Gerald Klerman and 
Jonathan Cole's review of imipramine and 
related antidepressants mentions HRSD 
three times in different contexts and always 
in relation to other scales. 23 

Phenomenological differentiations of 
depressed patients have been developed, 
using symptom patterns and clusters 
derived by multivariate statistical tech- 
niques. Grinker et al., Friedman et al., 
Hamilton and Wittenborn et al. have published
promising findings.

For example, in studying hospitalized
patients, especially severely depressed or 
schizophrenic patients, well validated 
scales, particularly by Lorr, Wittenborn, 
Hamilton and others are widely used. 
Instruments for nursing observations and 
for patients' self-ratings also have been 
developed. 

Drug-placebo differences were revealed by 
global estimates of degree of depression 
and by ratings of specific symptoms like 
anxiety, insomnia, weight gain and guilt.
Hamilton's rating scale, Lorr's Inpatient 
Multidimensional Psychiatric Scale and 
the Wittenborn Psychiatric Scale were 
sensitive to differences in most studies in 
which they were employed. 23 

This illustrates Martin Roth's statement 
in his brief biography of Max Hamilton that 
'It took more than a decade before the 
HRSD scale was recognised as a major 
contribution to knowledge and clinical 
practice.' 24 

Healy suggests that one reason HRSD 
was widely used is that it gave particular 
weight to anxiety symptoms, and thus was 
good at charting the positive effects of drugs, 
like imipramine, that were anxiolytic. Alan 
Broadhurst, a pharmacologist, who was in 
the group at Geigy that discovered imipra- 
mine told David Healy that, 'Max Hamilton 
was excited about imipramine and it cer- 
tainly did fit in beautifully with his rating 
scale. Years later he still referred to it as a 
happy coincidence'. 8 However, therapeutic 
regimes change for so many reasons that it is 
difficult to tease out the importance 
of HRSD relative to other factors and, 
although I do not have the data, it is likely 
that the uptake of imipramine was more 
rapid than that of HRSD. 25 

An alternative approach to assessing the 
rise of HRSD is to look at when and how it 
was criticised, and why these objections did 
not impede its progress to becoming the 
'Gold Standard.' In the 1960s, HRSD had a 
competitor, the Inventory for Measuring 
Depression (then ID and now Beck depres- 
sion inventory (BDI)), proposed by Aaron T 
Beck at the University of Pennsylvania. 26 
BDI has proved similarly enduring and also 
had the advantage of being a 'first' and the 
one against which other scales were cali- 
brated and validated. Beck was a pioneer of 
cognitive therapy and his scale was quite 
different to HRSD in being based on a 
patient's self-rating. In its original form the 
BDI consisted of 21 questions, each with 
four possible answers that the patient had to 
rate 0-3. This gave a theoretical maximum 
score of 63. A score above 30 indicated 
severe illness, 19-29 moderate, 10-18 mild 
and below 10 minimal. A common way of 
contrasting BDI with HRSD was to say that 
it was 'subjective': it relied upon patients' 
thoughts and feelings, while HRSD was 
'objective', because it was mainly based on 
clinician observations of bodily and behavioural 
symptoms. 
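
The arithmetic of the original BDI described above can be sketched in a few lines of Python. This is an illustration of the scoring rule as summarised in this paper, not a reproduction of Beck's published procedure. The function names are hypothetical, and because the bands quoted above leave a score of exactly 30 unassigned, placing it in the 'severe' band is an assumption.

```python
from typing import Sequence

N_ITEMS = 21        # questions in the original inventory, as described in the text
MAX_ITEM_SCORE = 3  # each answer is rated 0-3


def bdi_total(item_scores: Sequence[int]) -> int:
    """Sum the 21 self-rated item scores (hypothetical helper)."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    for score in item_scores:
        if score not in range(MAX_ITEM_SCORE + 1):
            raise ValueError(f"item score {score!r} is outside 0-{MAX_ITEM_SCORE}")
    return sum(item_scores)


def bdi_severity(total: int) -> str:
    """Map a total score to the severity bands quoted in the text.

    Assumption: a total of exactly 30 is treated as 'severe', since the
    text gives 'above 30' and '19-29' and leaves 30 itself unassigned.
    """
    if not 0 <= total <= N_ITEMS * MAX_ITEM_SCORE:
        raise ValueError("total must lie between 0 and 63")
    if total >= 30:
        return "severe"
    if total >= 19:
        return "moderate"
    if total >= 10:
        return "mild"
    return "minimal"
```

A patient answering every question at the highest level reaches the theoretical maximum of 63 (21 x 3), while one answering every question at level 1 scores 21 and falls in the 'moderate' band.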

In 1965, Maryse Metcalfe and Ellen 
Goldmann compared HRSD favourably 
with BDI, though they acknowledged that 
it depended on the skill of the rater and their 
clinical bias, which, they cautioned, 'made it 
somewhat difficult to compare meaningfully 
results obtained in different investiga- 
tions.' 27 In their view, the advantages of 
BDI were that it was simple, quick and easy 
to administer, and 'independent of doctors' 
and nurses' bias', seemingly relying on the 
'constant' of the patient. In 1967, John 
Schwab and colleagues, at the University 
of Florida College of Medicine, published a 
comparison of HRSD and BDI amongst 
ordinary and, one must assume, mostly non-depressed 
medical inpatients. 28,29 They 
found a good correlation (r = 0.75) in 
scores, but argued that the two scales were 
complementary because they measured 'different 
components of the depressive 
complex.' 

Hamilton assessed and offered a further 
elaboration of his own scale in 1967. 30 The 
second paper was largely methodological, 
though it did consider a larger patient group 
and females as well as males. He also added 
four extra symptoms to score. However, the 
article was not easy reading for his peers. It 
was highly mathematical, as the Abstract 
illustrates. 

'This is an account of further work on a 
rating scale for depressive states, including 
a detailed discussion of the general problems 
of comparing successive samples from 
a 'population', the meaning of the factor 
scores, and the other results obtained. The 
intercorrelation matrix of the items of the 
scale has been factor-analysed by the 
method of principal components, which 
were then given a Varimax rotation. 
Weights are given for calculating factor 
scores, both for rotated as well as unrotated 
factors.' 30 

Figure 6. Number of articles each year citing Hamilton M, A rating scale for depression. J Neurol Neurosurg 
Psychiatry 1960; 23: 56-62 and Hamilton M, Development of a rating scale for primary depressive illness. Br J 
Soc Clin Psychol 1967; 6: 278-296. Source: Web of Science. 

The data to the end of 1990 (Figure 6) 
show that, if citations in any way indicate 
the resources used by psychiatrists in their 
work, they stuck with the 1960 paper, 
for the later elaboration was cited less, even 
allowing for lags. 

In his 1967 paper, Hamilton noted, in a 
very revealing statement, that this study had 
been difficult because of the time taken to 
accumulate a sufficient number of patients 
with depression. What he actually meant 
was the difficulty in finding appropriate 
patients, that is, those with treatable illness, 
as he contrasted this difficulty with the ease 
of earlier studies with patients in mental 
hospitals where there were large numbers of 
chronic cases. 30 It seems that within a 
decade, what counted as depression, along 
with who suffered and how, had 
changed. 

I now want to jump another ten years and 
consider the ways that HRSD was being 
used in therapeutic trials at the end of the 
1970s. 31 By this time almost all trials were 
with psychopharmaceuticals, though ECT 
was still being used for patients diagnosed 
with 'severe' depression. In fact, prior treat- 
ment with ECT often excluded patients from 
participation in drug trials. However, 
HRSD was still being used in assessments 
of ECT, as well as psychotherapy. 32 And in 
1977, it was even used by Aaron Beck to 
compare 'pharmacotherapy' and 'cognitive 
therapy' (see Figure 7). 33 

To sample the uses of HRSD, I surveyed 
all of the clinical trials for depression pub- 
lished in the medical journals listed in Web 
of Science for 1979. It was impossible to 
produce reliable quantitative data of the 
series, because of the different drugs, proto- 
cols and citation practices, so I have chosen 
to discuss articles that are representative. In 
most trials HRSD was used with another 
scale and sometimes with multiple scales, as 
Table IV. Clinical Ratings of Severity of Depression at Initiation and Termination of Treatments (a) 

Hamilton Rating Scale for Depression 

                         Completers only                       Completers plus dropouts 
Time of                  Cognitive therapy  Pharmacotherapy    Cognitive therapy  Pharmacotherapy 
assessment               (N = 15)           (N = 14)           (N = 16)           (N = 20) 
Pretreatment    Mean     21.20              22.43              20.94              21.95 
                SD        3.34               4.24               3.40               4.27 
Posttreatment   Mean      5.80               9.29               6.25              10.10 
                SD        3.67               6.62               3.98               5.94 

Raskin Depression Rating Scale 

                         Completers only                       Completers plus dropouts 
Time of                  Cognitive therapy  Pharmacotherapy    Cognitive therapy  Pharmacotherapy 
assessment               (N = 14)           (N = 10)           (N = 15)           (N = 14) 
Pretreatment    Mean      9.86              10.20               9.93               9.86 
                SD        1.75                .92               1.71               1.41 
Posttreatment   Mean      3.93               5.80               4.20               7.10 
                SD        1.44               3.49               1.82               3.48 

(a) Reduced Ns are due to missing pre- or posttreatment scales on some patients. 

Figure 7. An example of reporting outcomes from the use of HRSD with another scale and for different 
treatments. 33 



in the report of a controlled trial of trimi- 
pramine and monoamine oxidase inhibitors 
at St Thomas's Hospital, London, published 
in 1979. The authors stated: 

The patients completed the Beck scale 
for depression and the Middlesex Hospital 
Questionnaire (MHQ), and were rated 
blindly by an independent assessor on the 
Hamilton rating scale for depression, the 
MRC depression scales, and an overall six- 
point rating of the severity of depression. A 
standard rating of side effects was com- 
pleted by the psychiatrist who regulated 
drug dosage to prevent knowledge of any 
such effects biasing the clinical ratings of the 
other assessor. 34,35 

The graphs below show how the results of 
the different scales were mapped for the six 
weeks of the trial (Figure 8). 



The same pattern was evident in a study 
of Limbitrol in California. 

The patients were evaluated at base- 
line using the Hamilton rating scale for pri- 
mary depressive illness (HDS) and 
the Covi anxiety scale. In addition, 
the patients completed the short form 
of the BDI and the Hopkins symptom 
checklist (SCL-58). Efficacy was assessed 
at follow-up visits after 1, 2, and 4 weeks 
of treatment by the physician, using the 
HDS and a global evaluation, and by the 
patient using the BDI, the SCL-58, and a 
global evaluation. In most instances, the 
BDI and the SCL-58 were completed by 
the patient prior to his seeing the 
psychiatrist. 36 

In a trial of lithium, HRSD was 
set against a 5-point nurse rating scale 
(Figure 9). 37 

Figure 8. An example of HRSD scores reported against many other scales. 34 



There are very few publications where the 
score was disaggregated and the different 
components mapped to identify specific 
changes. One exception was a study comparing 
amineptine and amitriptyline at Hôpital 
de St. Germain-en-Laye. 38 The changes in 
the total scores were presented first 
(Figure 10), but when the component 
scores were set out it was difficult to see 
the wood for the trees (Figure 11), and even then 
only 14 out of the 26 items scored had statistical 
significance. 

Figure 9. An example of reporting HRSD in comparison with a nurse rating scale. 37 

Discussion 

In this paper, I have made two main claims, 
first that HRSD was applied by clinicians to 
construct depression as a time-limited 
illness, and second, that this influential 
framing of the condition was used alongside 
other scales and only rose to dominance 
gradually. The assumption of the time- 
limited illness supports the claim of Healy 
and others that an HRSD-structured char- 
acterisation of depression was suited to drug 
therapy and the interests of pharmaceutical 
companies in the 1960s and 1970s. The view 
of psychiatrists in the first half of the 
twentieth century was that depressive 
mental illness was chronic, either because 
of patient susceptibilities rooted in somatic 
factors, such as heredity or physical disease, 
or in psychic variables influenced by 
upbringing, interpersonal relationships or 
personality. There was, however, some turnover 
in mental hospital patients and moves 
to treat many sufferers as out-patients. The 
patient population peaked in Britain in 1954 
at 140,000, when there were 121,000 beds, 
suggesting that turnover was not great and 
that most patients had chronic conditions. 
The rundown in the number of beds and the 
move to community care saw depression 
move out of the hospital and into the 
community, as an out-patient or general 
practitioner managed condition. In this 
setting, and due to new framings and new 
treatments, it was approached as a 'mild' 
and short-lived condition, at least compared 
to the illness that had previously required 
hospitalisation. 39 HRSD was used to frame 
this new 'depression' and its sufferers, 
normalising it to the ways of seeing and 
treating illness as a treatable episode or 






Table III. Mean (±S.D.) total scores on Hamilton Rating Scale before and during treatment 

Assessment          Amineptine        Amitriptyline 
Before treatment    50.92±1.18        52.12±1.42 
After 15 days       49.24±1.18**      50.80±1.[illegible]* 
After 30 days       41.62±1.12***     43.00±1.43*** 
After 45 days       36.54±1.25***     38.22±1.48*** 

Significance of difference from before treatment: *p<0.05, **p<0.01, ***p<0.001 or less 

Figure 10. A typical use of HRSD charting the effects of two drugs over time. 38 



[Table IV. Changes in mean (±S.D.) scores for individual items of Hamilton Rating Scale during treatment. 
Columns: item; drug (amineptine or amitriptyline); before treatment; after treatment at day 15, day 30 and day 45. 
Items shown: 1. Depressed mood; 2. Guilt feelings; 3. Suicide; 4. Initial insomnia; 5. Middle insomnia; 
6. Delayed insomnia (early morning waking); 7. Work and activity; 8. Retardation; 9. Agitation; 
10. Psychic anxiety; 11. Somatic anxiety; 12. Gastro-intestinal symptoms; 13. General somatic symptoms; 
14. Genital symptoms.] 

Figure 11. Reporting disaggregated HRSD scores, as illustrated above, became less common. 38 



episodes, rather than a life course condition. 
As such, HRSD served the interests of 
psychiatrists and psychiatry in the new era 
of treating specific illnesses outside of 
mental hospitals. 

HRSD rose to dominance 'from below.' 
When it was sanctioned 'from above' in the 
1980s, by the World Health Organisation, 
the Food and Drug Administration, and other 
medicine licensing agencies, this was 
acknowledging its widespread use, not creating 
it 'top down.' Paradoxically, the eventual 
dominance of HRSD was in large part 
due to its successful validation against 
holistic clinician assessments, the very thing 
Hamilton designed it against. However, 
HRSD was a clinician scoring instrument 
and proved simple to use because clinicians 
made it so, choosing overall scores rather 
than disaggregated or factor scores. In many 
ways, the 'S' in HRSD stood for 'Score' not 
'Scale', but either way it was a quantitative 
datum on a relatively large and finely 
grained scale of 100, at least when compared 
to the previous clinician scales. Overall, 
HRSD was a strange kind of 'standard,' 
being quite non-standard in the flexible and 
widely varying ways it was used, the number 
and type of items in the scale and the 
meanings given to its findings. 